Skip to content

fix(queue): treat 200 + non-REVALIDATED response as success#1272

Open
alex-all3dp wants to merge 1 commit into
opennextjs:mainfrom
alex-all3dp:fix/queue-treat-already-revalidated-as-success
Open

fix(queue): treat 200 + non-REVALIDATED response as success#1272
alex-all3dp wants to merge 1 commit into
opennextjs:mainfrom
alex-all3dp:fix/queue-treat-already-revalidated-as-success

Conversation

@alex-all3dp
Copy link
Copy Markdown
Contributor

@alex-all3dp alex-all3dp commented May 19, 2026

Problem

DOQueueHandler.executeRevalidation throws FatalError("…cannot be done. This error should never happen.") whenever the HEAD self-fetch returns 200 with an x-nextjs-cache value other than REVALIDATED. In real-world deployments serving meaningful traffic this happens routinely: when a user hits a stale page, the worker's in-isolate stale-while-revalidate regenerates the page in parallel with the queued revalidation. By the time the DO's HEAD self-fetch reaches the ISR handler, the page is already fresh in the incremental cache and Next.js returns HIT (or STALE) rather than REVALIDATED.

This produces noisy FatalError logs on healthy, correctly-serving pages. The desired outcome — fresh content in cache — has been achieved; the request just took a different path.

Reproduces under load with any throttling that delays the queued HEAD relative to the in-isolate SWR. Reported in #662; seen in production on opennextjs-cloudflare 1.19.9, Next.js 16.2.6 across deeply nested ISR routes (dynamic = 'error', revalidate in the 1-week–1-year range).

Fix

In packages/cloudflare/src/api/durable-objects/queue.ts, fall through to the existing success path when status === 200 regardless of the x-nextjs-cache value:

  • The sync table is updated (INSERT OR REPLACE INTO sync …).
  • The route is removed from routeInFailedState.
  • A debug log records the non-REVALIDATED status for observability.

The FatalError import is dropped (no other usage).

Test changes

The existing test "should not put it in failed state for an incorrect 200" (under failed revalidation) is replaced by "should treat 200 + non-REVALIDATED response as success (page already regenerated by another path)" under successful revalidation, which now also asserts the sync-table write.

All 22 tests in queue.spec.ts pass; full suite (328 tests) passes; pnpm code:checks clean.

Risk

Low. Behavior changes only for the previously-fatal branch. 200+REVALIDATED, 404, 500, and other non-200 responses are unchanged.

Closes #662


Open in Devin Review

The DO queue handler previously threw FatalError when the HEAD self-fetch
returned 200 with an x-nextjs-cache value other than REVALIDATED. This
commonly happens under load when in-isolate stale-while-revalidate has
already regenerated the page before the queued HEAD reaches the ISR
handler — the page is fresh in cache but Next.js returns HIT/STALE rather
than REVALIDATED.

Fall through to the success path (sync table updated, failed state
cleared) and log a debug message instead.
@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented May 19, 2026

Open in StackBlitz

npm i https://pkg.pr.new/@opennextjs/cloudflare@1272

commit: e99b181

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

Open in Devin Review

Copy link
Copy Markdown
Collaborator

@conico974 conico974 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex-all3dp Is it for PPR only ?

);
if (response.status === 200) {
const cacheStatus = response.headers.get("x-nextjs-cache");
if (cacheStatus !== "REVALIDATED") {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't treat STALE as a success that's for sure.
But even HIT should never happen (maybe in PPR, haven't tested that), we are supposed to remove the in isolate revalidation, if not it means there is a bug that we need to fix

Copy link
Copy Markdown
Contributor Author

@alex-all3dp alex-all3dp May 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @conico974

Not PPR-only, we hit it on regular ISR routes (dynamic = 'error' revalidate from 1 week to 1 year), no cacheComponents/PPR enabled. Our open-next.config.ts is

const openNextConfig: OpenNextConfig = {
  ...defineCloudflareConfig({
    incrementalCache: withRegionalCache(r2IncrementalCache, {
      mode: 'long-lived',
    }),
    queue: queueCache(doQueue, {regionalCacheTtlSec: 30, waitForQueueAck: false}),
    tagCache: doShardedTagCache({
      baseShardSize: 4,
      regionalCache: true,
    }),
    cachePurge: purgeCache({type: 'durableObject'}),
  }),
  dangerous: {
    middlewareHeadersOverrideNextConfigHeaders: true,
  },
  cloudflare: {
    skewProtection: {
      enabled: true,
      maxNumberOfVersions: 10,
      maxVersionAgeDays: 20
    },
  },
}

You're right on both points though. If in-isolate SWR-triggered regen shouldn't exist alongside the queue's HEAD regen, then this PR would just paper over an underlying bug it seems.

How do ypu prefer to handle this PR? Closing it would be reasonable. Alternatively narrow it to handle HIT only, drop STALE, add a guard rail ((e.g. only treat HIT as success when the cache freshness is verifiable) and mark the change as a defense for potential race conditions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] ISR with Partial Prerendering fails at runtime on Cloudflare R2

2 participants